Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Workaround for large object size #755

Closed
wants to merge 1 commit into from

Conversation

ycl6
Copy link

@ycl6 ycl6 commented Dec 12, 2024

This PR is to solve the problem when dealing with large object size, by casting total_size as a numeric object when it is stored as an attribute in globals.

When the integer class total_size exceed the .Machine$integer.max limit, users will receive the integer overflow warning message when FutureGlobals calculates size as I quoted below, and size will be NA as the result.

Warning message in size + sum(size_args, na.rm = FALSE):
“NAs produced by integer overflow”

## Total size?
size <- attr(x, "total_size", exact = TRUE)
if (!is.na(size)) {
size_args <- lapply(args, FUN = function(z) {
size <- attr(z, "total_size", exact = TRUE)
if (is.null(size)) NA_real_ else size
})
size_args <- unlist(size_args, use.names = FALSE)
size <- size + sum(size_args, na.rm = FALSE)
}
x <- NextMethod()
attr(x, "resolved") <- resolved
attr(x, "total_size") <- size
x
}

This has been reported when using other R packages that use future for parallelisation, for example satijalab/seurat#9316, satijalab/seurat#9484, jinworks/CellChat#250.

@HenrikBengtsson
Copy link
Collaborator

HenrikBengtsson commented Dec 15, 2024

Thanks for the report and for your troubleshooting.

This problem was introduced in future 1.34.0 (2024-07-29), when we switched from utils::object.size() to parallelly::serializedSize() to calculate object sizes.

Although we could work around it here in future, I think a more reliable approach is to fix parallelly::serializedSize() as planned in futureverse/parallelly#123.

PS. I think an integer overflow can happen even before the location where you suggested the fix, namely in:

total_size <- sum(sizes, na.rm = TRUE)

Closing this PR because it will be fixed in the parallelly package instead.

@ycl6
Copy link
Author

ycl6 commented Dec 15, 2024

Hi @HenrikBengtsson Thanks for nailing the source for making the changes!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants